
21 research outputs found

Machine learning approaches in microbiome research: challenges and best practices

Author: Alberto Tonda
Andrea Simeon
Andriy Temko
Eliana Ibrahimi
Georgios Papoutsoglou
Giacomo Vitali
Julia Eckenberger
Leo Lahti
Magali Berland
Marcus J. Claesson
Marta B. Lopes
Pierfrancesco Novielli
Rajesh Shigdel
Sabina Tangaro
Sonia Tarazona
Stéphane Béreux
Thomas Klammsteiner
Publication venue: Frontiers Media S.A.
Publication date: 01/09/2023

Predictive analysis of microbiome data within a machine learning (ML) workflow presents numerous domain-specific challenges involving preprocessing, feature selection, predictive modeling, performance estimation, model interpretation, and the extraction of biological information from the results. To assist decision-making, we offer a set of recommendations on algorithm selection, pipeline creation and evaluation, stemming from the COST Action ML4Microbiome. We compared the suggested approaches on a multi-cohort shotgun metagenomics dataset of colorectal cancer patients, focusing on their performance in disease diagnosis and biomarker discovery. It is demonstrated that the use of compositional transformations and filtering methods as part of data preprocessing does not always improve the predictive performance of a model. In contrast, multivariate feature selection, such as the Statistically Equivalent Signatures algorithm, was effective in reducing the classification error. When validated on a separate test dataset, this algorithm, in combination with random forest modeling, provided the most accurate performance estimates. Lastly, we showed how linear modeling by logistic regression coupled with visualization techniques such as Individual Conditional Expectation (ICE) plots can yield interpretable results and offer biological insights. These findings are significant for clinicians and non-experts alike in translational applications.

Directory of Open Access Journals

Applications of Machine Learning in Human Microbiome Studies: A Review on Feature Selection, Biomarker Identification, Disease Prediction and Treatment

The growing number of microbiome-related studies has notably increased the availability of data on human microbiome composition and function. These studies provide the essential material to deeply explore host-microbiome associations and their relation to the development and progression of various complex diseases. Improved data-analytical tools are needed to exploit all information from these biological datasets, taking into account the peculiarities of microbiome data, i.e., the compositional, heterogeneous and sparse nature of these datasets. The possibility of predicting host phenotypes based on taxonomy-informed feature selection, both to establish associations with the microbiome and to predict disease states, is beneficial for personalized medicine. In this regard, machine learning (ML) provides new insights into the development of models that can be used to predict outputs, such as classification and prediction in microbiology, infer host phenotypes to predict diseases, and use microbial communities to stratify patients by their characterization of state-specific microbial signatures. Here we review the state-of-the-art ML methods and respective software applied in human microbiome studies, performed as part of the COST Action ML4Microbiome activities. This scoping review focuses on the application of ML in microbiome studies related to association and clinical use for diagnostics, prognostics, and therapeutics. Although the data presented here is more related to the bacterial community, many algorithms could be applied in general, regardless of the feature type. This literature and software review covering this broad topic is aligned with the scoping review methodology.
The manual identification of data sources has been complemented with: (1) an automated publication search through the digital libraries of the three major publishers using the natural language processing (NLP) Toolkit, and (2) an automated identification of relevant software repositories on GitHub and ranking of the related research papers relying on a learning-to-rank approach.

University of Bergen

NORA - Norwegian Open Research Archives

Diposit Digital de la Universitat de Barcelona

Fondo Bibliográfico Digital Institucional

Advancing microbiome research with machine learning : key findings from the ML4Microbiome COST action

The rapid development of machine learning (ML) techniques has opened up the data-dense field of microbiome research for novel therapeutic, diagnostic, and prognostic applications targeting a wide range of disorders, which could substantially improve healthcare practices in the era of precision medicine. However, several challenges must be addressed to fully exploit the benefits of ML in this field. In particular, there is a need to establish "gold standard" protocols for conducting ML analysis experiments and improve interactions between microbiome researchers and ML experts. The Machine Learning Techniques in Human Microbiome Studies (ML4Microbiome) COST Action CA18131 is a European network established in 2019 to promote collaboration between discovery-oriented microbiome researchers and data-driven ML experts to optimize and standardize ML approaches for microbiome analysis. This perspective paper presents the key achievements of ML4Microbiome, which include identifying predictive and discriminatory 'omics' features, improving repeatability and comparability, developing automation procedures, and defining priority areas for the novel development of ML methods targeting the microbiome. The insights gained from ML4Microbiome will help to maximize the potential of ML in microbiome research and pave the way for new and improved healthcare practices.

Utrecht University Repository

Contemporary Challenges and Solutions

CA18131 CP16/00163 NIS-3317 NIS-3318 decision 295741 C18/BM/12585940

The human microbiome has emerged as a central research topic in human biology and biomedicine. Current microbiome studies generate high-throughput omics data across different body sites, populations, and life stages. Many of the challenges in microbiome research are similar to those in other high-throughput studies: the quantitative analyses need to address the heterogeneity of data, specific statistical properties, and the remarkable variation in microbiome composition across individuals and body sites. This has led to a broad spectrum of statistical and machine learning challenges that range from study design, data processing, and standardization to analysis, modeling, cross-study comparison, prediction, data science ecosystems, and reproducible reporting. Nevertheless, although many statistical and machine learning approaches and tools have been developed, new techniques are needed to deal with emerging applications and the vast heterogeneity of microbiome data. We review and discuss emerging applications of statistical and machine learning techniques in human microbiome studies and introduce the COST Action CA18131 “ML4Microbiome” that brings together microbiome researchers and machine learning experts to address current challenges such as standardization of analysis pipelines for reproducibility of data analysis results, benchmarking, improvement, or development of existing and new tools and ontologies.

University of Bergen

Repositório da Universidade Nova de Lisboa

EUR Research Repository

Cork Open Research Archive

NORA - Norwegian Open Research Archives

Open Repository and Bibliography - Luxembourg

Utrecht University Repository

Erciyes University - AVESIS

Riga Stradins university

Fondo Bibliográfico Digital Institucional

Applications of Machine Learning in Human Microbiome Studies: A Review on Feature Selection, Biomarker Identification, Disease Prediction and Treatment

Author: Aasmets Oliver
Berland Magali
Carrillo de Santa Pau Enrique
Claesson Marcus J.
Gruca Aleksandra
Hasic Jasminka
Hron Karel
Karaduzovic-Hadziabdic Kanita
Klammsteiner Thomas
Kolev Mikhail
Lahti Leo
Loncar-Turukalo Tatjana
Lopes Marta B.
Marcos-Zambrano Laura Judith
Moreno Victor
Moreno-Indias Isabel
Naskinova Irina
Org Elin
Paciência Inês
Papoutsoglou Georgios
Przymus Piotr
Shigdel Rajesh
Stres Blaz
Trajkovik Vladimir
Truu Jaak
Tsamardinos Ioannis
Vilne Baiba
Yousef Malik
Zdravevski Eftim
Publication venue: 'Frontiers Media SA'
Publication date: 28/10/2022

UTUPub

Statistical and Machine Learning Techniques in Human Microbiome Studies: Contemporary Challenges and Solutions

The human microbiome has emerged as a central research topic in human biology and biomedicine. Current microbiome studies generate high-throughput omics data across different body sites, populations, and life stages. Many of the challenges in microbiome research are similar to those in other high-throughput studies: the quantitative analyses need to address the heterogeneity of data, specific statistical properties, and the remarkable variation in microbiome composition across individuals and body sites. This has led to a broad spectrum of statistical and machine learning challenges that range from study design, data processing, and standardization to analysis, modeling, cross-study comparison, prediction, data science ecosystems, and reproducible reporting. Nevertheless, although many statistical and machine learning approaches and tools have been developed, new techniques are needed to deal with emerging applications and the vast heterogeneity of microbiome data. We review and discuss emerging applications of statistical and machine learning techniques in human microbiome studies and introduce the COST Action CA18131 "ML4Microbiome" that brings together microbiome researchers and machine learning experts to address current challenges such as standardization of analysis pipelines for reproducibility of data analysis results, benchmarking, improvement, or development of existing and new tools and ontologies.

UTUPub

Microbiota in a cooling-lubrication circuit and an option for controlling triethanolamine biodegradation

Author: Heribert Insam (5240999)
Maraike Probst (5241002)
Thomas Klammsteiner (5240996)
Publication date: 21/05/2018

Cooling and lubrication agents like triethanolamine (TEA) are essential for many purposes in industry. Due to biodegradation, they need continuous replacement, and byproducts of degradation may be toxic. This study investigates an industrial (1,200 m³) cooling-lubrication circuit (CLC) that has been in operation for 20 years and is supposedly in an ecological equilibrium, thus offering a unique habitat. Next-generation (Illumina MiSeq 16S rRNA amplicon) sequencing was used to profile the CLC-based microbiota and relate it to TEA and bicine dynamics at the sampling sites: influent, machine rooms, biofilms and effluent. Pseudomonas pseudoalcaligenes dominated the effluent and influent sites, while Alcaligenes faecalis dominated biofilms, and both species were identified as the major TEA-degrading bacteria. It was shown that a 15 min heat treatment at 50°C was able to slow down the growth of both species, a promising option to control TEA degradation at large scale.

FigShare

Black Soldier Fly School Workshops as Means to Promote Circular Economy and Environmental Awareness

Author: Andreas Walter
Carina Desirée Heussler
Heribert Insam
Magdalena Gassner
Markus Schermer
Suzanne Kapelari
Thomas Klammsteiner
Publication venue: 'MDPI AG'
Publication date: 17/11/2020

Today, insect applications for food and feed are of strong economic, ecological and social interest. Despite their tremendous potential, insects still elicit negative associations in the mindset of Western consumers, which is attributed to a lack of knowledge and scarce opportunities for engagement in this topic. The citizen science project ‘six-legged livestock’ aims to increase the potential of the insect Hermetia illucens (black soldier fly), merging the topics ‘waste re-valorisation’ and ‘protein production’ as a cross-link to circular economy. Workshops were held in four school classes, involving 89 pupils aged 15 to 18 years. Making use of organic wastes, participating school classes ran eight rearing systems containing a total of 1800 H. illucens larvae. In the four-week experiments, the pupils monitored larval growth and development. Evidently, the pupils were highly motivated to run their rearing systems and fulfil their working tasks. Furthermore, negative associations with insects, including phobia and scepticism, decreased, while excitement for the topic increased after hands-on work with the insects. The presented project may be considered an innovative approach paving the way for the establishment of insects as an important educational tool, since they are still underrepresented in scholarly curricula, despite the public outrage over insect decline.

Multidisciplinary Digital Publishing Institute

Microbiota in a cooling-lubrication circuit and an option for controlling triethanolamine biodegradation

Author: Heribert Insam
Lawson GL
Macherey-Nagel
Maraike Probst
Mattsby-Baltzer I
Thomas Klammsteiner
Publication venue: 'Informa UK Limited'

Crossref

Machine learning approaches in microbiome research: challenges and best practices

Author: Berland Magali
Béreux Stéphane
Claesson Marcus J.
Eckenberger Julia
Ibrahimi Eliana
Klammsteiner Thomas
Lahti Leo
Lopes Marta B.
Novielli Pierfrancesco
Papoutsoglou Georgios
Shigdel Rajesh
Simeon Andrea
Tangaro Sabina
Tarazona Sonia
Temko Andriy
Tonda Alberto
Vitali Giacomo
Publication venue: Frontiers Media
Publication date: 22/09/2023

Predictive analysis of microbiome data within a machine learning (ML) workflow presents numerous domain-specific challenges involving preprocessing, feature selection, predictive modeling, performance estimation, model interpretation, and the extraction of biological information from the results. To assist decision-making, we offer a set of recommendations on algorithm selection, pipeline creation and evaluation, stemming from the COST Action ML4Microbiome. We compared the suggested approaches on a multi-cohort shotgun metagenomics dataset of colorectal cancer patients, focusing on their performance in disease diagnosis and biomarker discovery. It is demonstrated that the use of compositional transformations and filtering methods as part of data preprocessing does not always improve the predictive performance of a model. In contrast, multivariate feature selection, such as the Statistically Equivalent Signatures algorithm, was effective in reducing the classification error. When validated on a separate test dataset, this algorithm, in combination with random forest modeling, provided the most accurate performance estimates. Lastly, we showed how linear modeling by logistic regression coupled with visualization techniques such as Individual Conditional Expectation (ICE) plots can yield interpretable results and offer biological insights. These findings are significant for clinicians and non-experts alike in translational applications.

HAL-Paris1

HAL-Polytechnique